Abstract

The degrees of freedom (DoF) of the two-user Gaussian multiple-input multiple-output (MIMO)
broadcast channel with confidential messages (BCC) are studied under the assumption that delayed channel
state information (CSI) is available at the transmitter. We characterize the optimal secrecy DoF (SDoF)
region and show that it can be achieved by a simple artificial noise alignment (ANA) scheme. The
proposed scheme sends the confidential messages superposed with artificial noise over several time
slots. Exploiting delayed CSI, the transmitter aligns the transmit signal in such a way that the useful
message can be extracted at the intended receiver but is completely drowned by the artificial noise at the
unintended receiver. The proposed scheme can be interpreted as a non-trivial extension of the Maddah-Ali
and Tse (MAT) scheme and enables us to quantify the resource overhead, or equivalently the DoF loss, to
be paid for secure communications.
I. Introduction

We consider the two-user Gaussian multi-input multi-output broadcast channel with confidential
messages (MIMO-BCC), where the transmitter sends two confidential messages to receivers A and B,
respectively, while keeping each of them secret from the unintended receiver. By letting $m$, $n_A$, and $n_B$
denote the number of antennas at the transmitter, receiver A, and receiver B, respectively, the corresponding
channel outputs are given by

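\begin{align}
y_t &= H_t x_t + e_t, \tag{1a} \\
z_t &= G_t x_t + b_t, \quad t = 1, 2, \ldots, n, \tag{1b}
\end{align}

where $(y_t, z_t)$ denotes the observations at receivers A and B, respectively, at time instant $t$;
$H_t \in \mathcal{H} \subseteq \mathbb{C}^{n_A \times m}$ and $G_t \in \mathcal{G} \subseteq \mathbb{C}^{n_B \times m}$ are the associated channel matrices; $(e_t, b_t)$ are assumed to be independent
and identically distributed (i.i.d.) additive white Gaussian noises $\sim \mathcal{N}_{\mathbb{C}}(0, I)$; the input vector $x_t \in \mathbb{C}^{m \times 1}$
is subject to the average power constraint

\begin{equation}
\frac{1}{n} \sum_{t=1}^{n} \mathrm{tr}(x_t x_t^H) \le P. \tag{2}
\end{equation}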

Furthermore, as in 12, we assume any arbitrary stationary fading process such that H t and Gt are 
mutually independent and change independently from one instant to the next. Note that the
channel at hand boils down to the conventional Gaussian MIMO wiretap channel when the transmitter
wishes to send a single message to the intended receiver while keeping it secret from the other one, namely,
the eavesdropper.

The secrecy capacity region of the two-user MIMO Gaussian BCC with perfect channel state information
at the transmitter (CSIT) and receivers has been characterized in [2] (see also references therein). As a
special case, the Gaussian MIMO wiretap channel has been extensively studied in [3]-[7]. However, the
secrecy capacity of the MIMO Gaussian wiretap channel with general (imperfect) CSI at the transmitter
remains open. Since a complete characterization of the capacity region in this case is very difficult (if not
impossible), a number of contributions have focused on the so-called secrecy degrees of freedom (SDoF),
which capture the behavior in the high signal-to-noise ratio (SNR) regime (see [8]-[11] and references therein).
References [8]-[10] investigated compound models where channel uncertainty at the encoder is
modeled as a finite set of channel states, while [11] investigated the scenario where the transmitter
knows some temporal structure of the block-fading processes. A fundamental observation is that unless
the two channels enjoy asymmetric statistical properties,¹ perfect secrecy cannot be guaranteed under
a general CSIT assumption. In other words, if the statistics of the underlying channels seen by both
receivers are symmetrical, additional side information (not necessarily instantaneous CSIT) is essential to
ensure a positive SDoF, by introducing some asymmetry at the encoder. As a matter of fact, this reveals
one of the major limitations of the wiretap model, whose performance strongly depends on the quality of
the channel state information at the transmitter side. Evidently, addressing CSI issues theoretically is of
fundamental importance for secrecy systems.

1 This may be in terms of asynchronous fading variation, different fading speeds, number of antennas, etc.

Recently, in the context of the multi-antenna broadcast channel, the pioneering work [1] showed that
completely outdated channel state information at the transmitter is still very useful and increases the
degrees of freedom of the multi-user channel. Motivated by this exciting result, the new assumption,
commonly referred to as delayed CSIT, has since been applied to several multi-user settings, including
the MIMO broadcast channel, X channel, and interference channel [12]-[15]. Non-trivial DoF gains
have been shown in all these settings with delayed CSIT. The main idea behind the utility of
delayed CSIT can be best described with the term "retrospective interference alignment" introduced in
[13] and [16]. That is, the knowledge of the past channel states is used to align the interference between
users into a spatial/temporal subspace with a reduced dimension at each receiver.

In this paper, we study the impact of delayed CSIT on the secrecy degrees of freedom in a MIMO
broadcast channel. In our setting, delayed CSI of a given receiver is available both at the transmitter
and the other receiver,² whereas each receiver knows its own instantaneous channel. Such a scenario
is of practical interest since the receivers may send their channel states to the transmitter via delayed
feedback links that may be overheard by the other receivers. We first characterize the optimal SDoF of
the Gaussian MIMO wiretap channel with delayed CSIT. It is shown that delayed CSIT can significantly
improve the SDoF, provided that the number of transmit antennas is larger than that of receive antennas,
i.e., $m > \max\{n_A, n_B\}$. In this case, we prove that a simple artificial noise alignment (ANA) scheme
achieves the optimal SDoF. The proposed scheme sends the confidential symbols embedded in
artificial noise in such a way that the artificial noise is aligned in a subspace at the legitimate receiver
while it fills the full signal space at the eavesdropper. The case of partial knowledge, where the transmitter
has delayed CSI only on the legitimate channel, is also investigated. In this case, we show that a strictly
smaller SDoF is achieved compared to the case with delayed CSIT on both channels. Then, we consider
the two-user Gaussian MIMO-BCC and characterize the optimal SDoF region. The achievability follows
from an artificial noise alignment scheme adapted to convey two confidential messages. The proposed
scheme can be seen as a non-trivial extension of the Maddah-Ali and Tse (MAT) scheme. A simple comparison
with the MAT scheme enables us to quantify the resource overhead, or equivalently the DoF loss, to be
paid to guarantee the confidentiality of messages. Although delayed CSIT is found beneficial for a large
range of transmit antennas, analogously to the conclusions drawn for other network systems without secrecy
constraints [1], [13], [14], we remark that the lack of perfect CSIT significantly degrades the performance
of the secrecy systems.

2 Unless explicitly mentioned otherwise, we assume that delayed CSI of both channels is available to the transmitter, i.e., it observes
$H^{t-1}$ and $G^{t-1}$ for every $t = 1, 2, \ldots$.

The rest of the paper is organized as follows. Section II introduces the assumptions and some useful
lemmas, while Section III summarizes our main results on the optimal SDoF. Sections IV and V are
devoted to the proofs of the main theorems. Finally, the paper is concluded in Section VI with some open
problems and future perspectives.

II. Notations, Definitions, and Assumptions 

A. Notations 

Boldface lower-case letters v and upper-case letters M are used to denote vectors and matrices,
respectively. We use the superscript notation $X^n$ to denote a sequence $(X_1, \ldots, X_n)$ for any type of
variables. Matrix transpose, Hermitian transpose, inverse, trace, and determinant are denoted by $A^T$,
$A^H$, $A^{-1}$, $\mathrm{tr}(A)$, and $\det(A)$, respectively. We let $\mathrm{diag}(\{A_t\}_t)$ denote the block diagonal matrix with the
matrices $A_t$ as diagonal elements. Logarithms are in base 2 unless otherwise specified. The differential
entropy of $X$ is denoted by $h(X)$. $(x)^+$ means $\max\{0, x\}$. The little-o notation $o(\log P)$ stands for any
real-valued function $f(P)$ such that $\lim_{P \to \infty} f(P)/\log P = 0$. The dot equality means equality of the "pre-log"
factor, i.e., $f(P) \doteq g(P)$ is equivalent to $f(P) = g(P) + o(\log P)$; the dot inequalities $\dot{\geq}$ and $\dot{\leq}$ are
similarly defined.

B. Assumptions and Definitions 

The following assumptions and definitions will be applied in the rest of the paper. 

Definition 1 (channel states): The channel matrices $H_t$ and $G_t$ are called the states of the channel at
instant $t$. For simplicity, we also define the state matrix $S_t$ as $S_t = \begin{bmatrix} H_t \\ G_t \end{bmatrix}$.

Assumption 2.1 (delayed CSIT): At each time $t$, the states of the past are known to all terminals.
However, the instantaneous states $H_t$ and $G_t$ are only known to the respective receivers.

Under these assumptions, we define the code and the optimal SDoF region summarized below. 
Definition 2 (code and SDoF region): A code for the Gaussian MIMO-BCC with delayed CSIT
consists of:

• A sequence of stochastic encoders given by

$$\{F_t : \mathcal{W}_A \times \mathcal{W}_B \times \mathcal{H}^{t-1} \times \mathcal{G}^{t-1} \to \mathbb{C}^{m}\}_{t=1}^{n},$$

where the messages $W_A$ and $W_B$ are uniformly distributed over $\mathcal{W}_A$ and $\mathcal{W}_B$, respectively.

• The decoder A is given by the mapping $\hat{W}_A : \mathbb{C}^{n_A \times n} \times \mathcal{H}^{n} \times \mathcal{G}^{n-1} \to \mathcal{W}_A$.

• The decoder B is given by the mapping $\hat{W}_B : \mathbb{C}^{n_B \times n} \times \mathcal{H}^{n-1} \times \mathcal{G}^{n} \to \mathcal{W}_B$.

An SDoF pair $(d_A, d_B)$ is said to be achievable if there exists a code that satisfies the reliability conditions at
both receivers

\begin{align}
\lim_{P \to \infty} \liminf_{n \to \infty} \frac{\log |\mathcal{W}_A(n, P)|}{n \log P} \ge d_A, \qquad \lim_{P \to \infty} \limsup_{n \to \infty} \Pr\{W_A \ne \hat{W}_A\} = 0, \tag{3} \\
\lim_{P \to \infty} \liminf_{n \to \infty} \frac{\log |\mathcal{W}_B(n, P)|}{n \log P} \ge d_B, \qquad \lim_{P \to \infty} \limsup_{n \to \infty} \Pr\{W_B \ne \hat{W}_B\} = 0, \tag{4}
\end{align}

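as well as the perfect secrecy condition

\begin{align}
\lim_{P \to \infty} \limsup_{n \to \infty} \frac{I(W_A; z^n, \mathcal{H}^{n-1}, \mathcal{G}^{n})}{n \log P} = 0, \tag{5} \\
\lim_{P \to \infty} \limsup_{n \to \infty} \frac{I(W_B; y^n, \mathcal{H}^{n}, \mathcal{G}^{n-1})}{n \log P} = 0, \tag{6}
\end{align}

where each leakage term is measured against everything available to the unintended decoder, i.e., its channel
outputs together with the channel states appearing in its decoding map.

The union of all achievable pairs $(d_A, d_B)$ is called the optimal SDoF region.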

Assumption 2.2 (channel symmetry): At any instant $t$, the rows of the state matrix $S_t$ are independent
and identically distributed. Furthermore, we limit ourselves to the class of fading processes in which the
state matrix $S_t$ has full rank $\min\{m, n_A + n_B\}$ almost surely at any time instant.³

As direct consequences of the channel symmetry, we readily have that the marginal distributions of 
any antenna output are equal conditioned on the same previous observations and/or the source message. 
Namely, we have the following property. 

Property 2.1 (channel output symmetry): Let $\Omega_t = \{y_{1,t}, \ldots, y_{n_A,t}, z_{1,t}, \ldots, z_{n_B,t}\}$ be the collection of
random variables representing all antenna outputs at time instant $t$. Then, for any subsets $\omega_\beta$ and $\omega_\alpha$ of
random variables in $\Omega_t$ satisfying $|\omega_\beta| = |\omega_\alpha|$, we have

\begin{equation}
\Pr(\omega_\beta \,|\, y^{t-1}, z^{t-1}, U_t) = \Pr(\omega_\alpha \,|\, y^{t-1}, z^{t-1}, U_t) \tag{7}
\end{equation}

for any random variables $U_t \leftrightarrow (H_t, G_t, W) \leftrightarrow \Omega_t$, with $t \in \{1, \ldots, n\}$, that form a Markov chain.



3 This assumption is used to prove the achievability, although the converse proof does not need it.
Using the fact that current channel outputs do not depend on the future channel realizations, we can
easily show that Property 2.1 also holds when we add the conditioning on $S^n$, namely,

\begin{equation}
h(\omega_\beta \,|\, y^{t-1}, z^{t-1}, S^n, W) = h(\omega_\alpha \,|\, y^{t-1}, z^{t-1}, S^n, W). \tag{8}
\end{equation}

In the following, we omit the conditioning on $S^n$ for notational simplicity.

C. Preliminaries 

For the sake of clarity, we collect the results that will be used repeatedly in the rest of the paper. First,
the following lemma is a direct consequence of the channel output symmetry.

Lemma 1 (properties of channel symmetry): The following inequalities hold under the channel output
symmetry Property 2.1:

\begin{align}
\min\{m, n_A + n_B\}\, h(z^n) &\ge n_B\, h(y^n, z^n), \tag{9a} \\
\min\{m, n_A + n_B\}\, h(y^n) &\ge n_A\, h(y^n, z^n), \tag{9b} \\
\min\{m, n_A + n_B\}\, h(z^n) &\ge n_B\, h(y^n), \tag{9c} \\
\min\{m, n_A + n_B\}\, h(y^n) &\ge n_A\, h(z^n). \tag{9d}
\end{align}

Furthermore, the same inequalities hold true conditional on $W$.

Proof: The first two inequalities are proved in Appendix A. To prove (9c), from (9a), we have

\begin{align}
h(z^n) &\ge \frac{n_B}{\min\{m, n_A + n_B\}}\, h(y^n, z^n) \tag{10} \\
&\ge \frac{n_B}{\min\{m, n_A + n_B\}}\, h(y^n), \tag{11}
\end{align}

where the last inequality comes from the fact that $h(z^n \,|\, y^n) \ge h(z^n \,|\, y^n, x^n) = h(b^n) = o(\log P)$. The same
steps can be applied to obtain (9d). ■
Then, all the achievable DoF results are essentially based on the rank of the channel matrices.
Lemma 2: For any matrix $A$ which does not depend on $P$, we have

\begin{equation}
\lim_{P \to \infty} \frac{\log\det(I + P A A^H)}{\log P} = \mathrm{rank}(A). \tag{12}
\end{equation}

Proof: Let $(\sigma_1, \ldots, \sigma_r)$ be the $r = \mathrm{rank}(A)$ non-zero singular values of $A$. Then, we have that

$$\log\det(I + P A A^H) = \sum_{k=1}^{r} \log(1 + P \sigma_k^2) \doteq r \log P,$$

since the non-zero singular values do not depend on $P$ and thus do not vanish with $P$ either. ■
III. Main Results

In this section, we highlight our main results on the optimal SDoF of the Gaussian MIMO wiretap 
channel and then on the more general Gaussian MIMO broadcast channel with confidential messages. 
We shall interpret the results through comparisons and numerical evaluations. 

A. Wiretap Channel 

Theorem 1 (wiretap channel with delayed CSIT): In the presence of delayed CSIT on both the legitimate
channel and the eavesdropper channel, the optimal SDoF of the Gaussian MIMO wiretap channel with
$m$, $n_A$, $n_B$ antennas at the transmitter, the legitimate receiver, and the eavesdropper, respectively, is given by

\begin{equation}
d_s(n_A, n_B, m) =
\begin{cases}
0, & m \le n_B, \\
m - n_B, & n_B < m \le n_A, \\
\dfrac{n_A m (m - n_B)}{n_A n_B + m (m - n_B)}, & \max\{n_A, n_B\} < m \le n_A + n_B, \\
\dfrac{n_A (n_A + n_B)}{n_A + 2 n_B}, & m > n_A + n_B.
\end{cases} \tag{13}
\end{equation}
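As a quick numerical companion to Theorem 1, the following minimal sketch evaluates the piecewise expression (13) with exact rational arithmetic. The function name `sdof_delayed` is ours and purely illustrative; the sketch assumes integer antenna counts and simply transcribes the four cases of the theorem.

```python
from fractions import Fraction


def sdof_delayed(n_a: int, n_b: int, m: int) -> Fraction:
    """Wiretap SDoF with delayed CSIT on both channels, transcribed from (13)."""
    if m <= n_b:
        # Not enough transmit antennas to hide anything from the eavesdropper.
        return Fraction(0)
    if m <= n_a:
        # Regime n_b < m <= n_a.
        return Fraction(m - n_b)
    # Regime m > max(n_a, n_b); antennas beyond n_a + n_b bring no further gain.
    m_bar = min(m, n_a + n_b)
    return Fraction(n_a * m_bar * (m_bar - n_b),
                    n_a * n_b + m_bar * (m_bar - n_b))


if __name__ == "__main__":
    for m in range(2, 9):
        print(m, sdof_delayed(3, 2, m))
```

Capping `m` at `n_a + n_b` reproduces the last case of (13), since substituting $m = n_A + n_B$ into the third case gives $n_A(n_A + n_B)/(n_A + 2 n_B)$.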

In the wiretap setting, it is not always reasonable to assume any CSI on the eavesdropper channel 
at the transmitter side. In this case, we may consider delayed CSIT only on the legitimate channel and 
without CSIT on the eavesdropper channel. With this asymmetric CSI assumption, hereafter referred to 
as delayed partial CSIT, we can show that a strictly positive yet smaller SDoF than delayed CSIT on 
both channels is still achievable for a wide range of number of antennas. 

Theorem 2 (wiretap channel with delayed partial CSIT): In the presence of delayed partial CSIT, either
on the legitimate channel or the eavesdropper channel, the following SDoF is achievable for the MIMO
Gaussian wiretap channel for $m > \max\{n_A, n_B\}$:

\begin{equation}
d_{\mathrm{partial}}(n_A, n_B, m) =
\begin{cases}
\dfrac{n_A (m - n_B)}{m}, & \max\{n_A, n_B\} < m \le n_A + n_B, \\
\dfrac{n_A^2}{n_A + n_B}, & m > n_A + n_B.
\end{cases} \tag{14}
\end{equation}

Note that this is the best known achievable SDoF in this setting; the converse is yet to be proved.

In order to quantify the benefit of delayed CSIT, we summarize the SDoF with perfect, delayed, and
without CSIT in Table I and provide an example with $n_A = 3$, $n_B = 2$ in Fig. 1. We remark that delayed
CSIT is beneficial only when the number of transmit antennas is larger than the number of receive
antennas, i.e., $m > \max\{n_A, n_B\}$, since the SDoF is $(m - n_B)^+$ for $m \le \max\{n_A, n_B\}$ with perfect,
delayed, and without CSIT. As the number $m$ of transmit antennas increases, the SDoF grows until
$m = n_A + n_B$ for perfect and delayed CSIT, while it does not increase with $m$ without CSIT. It appears
that with both perfect and delayed CSIT, we cannot exploit any gain for $m$ beyond $n_A + n_B$. Furthermore,
we remark that delayed CSIT only on the legitimate channel incurs a non-negligible loss compared to
delayed CSIT on both channels. This is because the transmitter without CSI on the eavesdropper channel
cannot access the signal overheard by the eavesdropper, which reduces the signal dimension to be
exploited by the legitimate receiver.

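TABLE I

Comparison of the SDoF under different CSIT assumptions for $m > \max\{n_A, n_B\}$.

CSIT               $\max\{n_A, n_B\} < m \le n_A + n_B$                  $m > n_A + n_B$
-----------------  ----------------------------------------------------  -------------------------------------
perfect            $m - n_B$                                             $n_A$
delayed            $\frac{n_A m (m - n_B)}{n_A n_B + m (m - n_B)}$       $\frac{n_A (n_A + n_B)}{n_A + 2 n_B}$
delayed partial    $\frac{n_A (m - n_B)}{m}$                             $\frac{n_A^2}{n_A + n_B}$
no                 $(n_A - n_B)^+$                                       $(n_A - n_B)^+$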

Fig. 1. SDoF with $n_A = 3$ and $n_B = 2$ with perfect, delayed, and no CSIT, as a function of the number $m$ of transmit antennas.
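The comparison in Table I can be evaluated numerically. The sketch below is an illustrative transcription of the four rows of the table for $m > \max\{n_A, n_B\}$; the function name `sdof_table_row` and the dictionary keys are our own choices, not part of the original text.

```python
from fractions import Fraction


def sdof_table_row(n_a: int, n_b: int, m: int) -> dict:
    """SDoF under the four CSIT assumptions of Table I (needs m > max(n_a, n_b))."""
    if m <= max(n_a, n_b):
        raise ValueError("Table I only covers m > max(n_a, n_b)")
    # Both table columns are covered by capping m at n_a + n_b.
    m_bar = min(m, n_a + n_b)
    return {
        "perfect": Fraction(min(m - n_b, n_a)),
        "delayed": Fraction(n_a * m_bar * (m_bar - n_b),
                            n_a * n_b + m_bar * (m_bar - n_b)),
        "delayed partial": Fraction(n_a * (m_bar - n_b), m_bar),
        "no": Fraction(max(n_a - n_b, 0)),
    }


if __name__ == "__main__":
    # Setting of Fig. 1: n_A = 3, n_B = 2.
    for m in range(4, 9):
        row = sdof_table_row(3, 2, m)
        print(m, {k: str(v) for k, v in row.items()})
```

For the setting of Fig. 1, the four values are ordered as perfect, delayed, delayed partial, no CSIT, from largest to smallest, for every $m > 3$.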




B. Broadcast channel with confidential messages 

Next, we present the achievable SDoF region of the two-user MIMO-BCC with delayed CSIT. 
Theorem 3 (BCC with delayed CSIT): The optimal SDoF region $\mathcal{R}_{\mathrm{BCC}}$ of the two-user MIMO-BCC
with delayed CSIT is given as the set of non-negative $(d_A, d_B)$ satisfying

\begin{align}
\frac{d_A}{d_s(n_A, n_B, m)} + \frac{d_B}{\min\{m, n_A + n_B\}} &\le 1, \tag{15a} \\
\frac{d_A}{\min\{m, n_A + n_B\}} + \frac{d_B}{d_s(n_B, n_A, m)} &\le 1, \tag{15b}
\end{align}

for any $m > \max\{n_A, n_B\}$. If $n_B < m \le n_A$, we have $d_A \le m - n_B$ and $d_B = 0$, whereas if $n_A < m \le n_B$,
we have $d_A = 0$ and $d_B \le m - n_A$.

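Corollary 3.1: For the case $m > \max\{n_A, n_B\}$, the SDoF region is characterized by the two corner
points $(0, d_s(n_B, n_A, m))$, $(d_s(n_A, n_B, m), 0)$ and the sum SDoF point given by

\begin{equation}
(d_A, d_B) =
\begin{cases}
\left( \dfrac{n_A (m - n_B)}{m},\ \dfrac{n_B (m - n_A)}{m} \right), & \max\{n_A, n_B\} < m \le n_A + n_B, \\
\left( \dfrac{n_A^2}{n_A + n_B},\ \dfrac{n_B^2}{n_A + n_B} \right), & m > n_A + n_B.
\end{cases} \tag{16}
\end{equation}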
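The sum SDoF point of Corollary 3.1 is the intersection of the two boundary lines (15a) and (15b). The following sketch checks this numerically under the formulas as stated above; the helper names `d_s`, `sum_sdof_point`, and `constraint_values` are hypothetical and introduced only for this illustration.

```python
from fractions import Fraction


def d_s(n_int: int, n_unint: int, m: int) -> Fraction:
    """Wiretap SDoF (13) for m > max(n_int, n_unint); n_int is the intended receiver."""
    m_bar = min(m, n_int + n_unint)
    return Fraction(n_int * m_bar * (m_bar - n_unint),
                    n_int * n_unint + m_bar * (m_bar - n_unint))


def sum_sdof_point(n_a: int, n_b: int, m: int):
    """Sum SDoF point (16) for m > max(n_a, n_b)."""
    m_bar = min(m, n_a + n_b)
    return (Fraction(n_a * (m_bar - n_b), m_bar),
            Fraction(n_b * (m_bar - n_a), m_bar))


def constraint_values(n_a: int, n_b: int, m: int, d_a: Fraction, d_b: Fraction):
    """Left-hand sides of (15a) and (15b) evaluated at the pair (d_a, d_b)."""
    m_bar = min(m, n_a + n_b)
    lhs_a = d_a / d_s(n_a, n_b, m) + d_b / m_bar
    lhs_b = d_a / m_bar + d_b / d_s(n_b, n_a, m)
    return lhs_a, lhs_b


if __name__ == "__main__":
    for (n_a, n_b, m) in [(3, 2, 4), (3, 2, 5), (3, 2, 7), (1, 1, 2)]:
        point = sum_sdof_point(n_a, n_b, m)
        print((n_a, n_b, m), point, constraint_values(n_a, n_b, m, *point))
```

Both left-hand sides evaluate to exactly 1 at the sum point, which confirms that it lies on both boundaries of the region.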



Remark 3.1: We can find trivial upper bounds to the above SDoF region for the case of $m >
\max\{n_A, n_B\}$. On one hand, the SDoF region with delayed CSIT is dominated by the SDoF region with
perfect CSIT. The SDoF region with perfect CSIT is a rectangle connecting the three corner points $(\min\{n_A, m - n_B\}, 0)$,
$(\min\{n_A, m - n_B\}, \min\{m - n_A, n_B\})$, and $(0, \min\{m - n_A, n_B\})$. We can also compare the
above SDoF region with delayed CSIT and the DoF region of the two-user MIMO-BC with delayed
CSIT [12], given by

\begin{align}
\frac{d_A}{\min\{m, n_A\}} + \frac{d_B}{\min\{m, n_A + n_B\}} &\le 1, \tag{17a} \\
\frac{d_A}{\min\{m, n_A + n_B\}} + \frac{d_B}{\min\{m, n_B\}} &\le 1. \tag{17b}
\end{align}

Obviously, since the SDoF is always upper bounded by the DoF of the MIMO channel, namely $d_s(n_A, n_B, m) \le
\min\{n_A, m\}$ and $d_s(n_B, n_A, m) \le \min\{n_B, m\}$, the SDoF region is dominated by the DoF region.
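We provide an insight into the proposed artificial noise alignment scheme, which achieves the sum SDoF
point $(\frac{1}{2}, \frac{1}{2})$ over the two-user MISO-BCC. Let us consider the four-slot scheme where the transmitter
sends six independent Gaussian distributed symbols $u = [u_1\ u_2]^T$, $v_A = [v_{11}\ v_{12}]^T$, $v_B = [v_{21}\ v_{22}]^T$
whose powers scale equally with $P$. Specifically, the transmit vectors are given by

\begin{equation}
x_1 = u, \quad
x_2 = v_A + \begin{bmatrix} h_1^T u \\ 0 \end{bmatrix}, \quad
x_3 = v_B + \begin{bmatrix} g_1^T u \\ 0 \end{bmatrix}, \quad
x_4 = \begin{bmatrix} (g_2^T v_A + g_{21} h_1^T u) + (h_3^T v_B + h_{31} g_1^T u) \\ 0 \end{bmatrix}, \tag{18}
\end{equation}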
where, for simplicity of demonstration, we omit the scaling factors that fulfill the power constraint (2).
Note that this simplification, also adopted in [1] and other related works, does not affect the high SNR
analysis carried out here. The following remarks are in order. First, it can be easily shown that, at
receiver A, $v_A$ lies in a two-dimensional subspace, while the unintended signal $v_B$ plus the artificial noise
are aligned in another two-dimensional subspace. Thus, the intended message $v_A$ can be recovered
from the four-dimensional observation at receiver A. Second, $v_A$ is drowned in the observation at
receiver B. More precisely, at receiver B, $v_A$ is squeezed into a one-dimensional subspace filled with artificial
noise, which makes it impossible to recover any useful information about $v_A$. Due to the symmetry, the
same holds for $v_B$. Therefore, we can send simultaneously two confidential symbols to each receiver over
four slots, yielding the sum SDoF point $(\frac{1}{2}, \frac{1}{2})$.

Fig. 2. The two-user DoF/SDoF region with $m = 5$, $n_A = 3$, $n_B = 2$.

The four-slot scheme contains two special cases of interest. If we consider the MISO wiretap channel
where the transmitter wishes to convey $v_A$ to receiver A while keeping it secret from receiver B, we let
$v_B = 0$ and ignore the third time slot. This provides an SDoF of $\frac{2}{3}$. If we consider the two-user MISO-BC
without secrecy constraint, we remove the artificial noise transmission by letting $u = 0$ and ignoring the
first time slot. This boils down to the MAT scheme [1]. The four-slot scheme, as well as the more general
artificial noise alignment scheme presented in Section V, is indeed a non-trivial extension of retrospective
interference alignment schemes for MIMO broadcast channels [1], [12] to secure communications. The
comparison with the three-slot MAT scheme can be interpreted as follows. The messages can be kept
secret at the price of an additional resource (one slot), which appears as a DoF loss with respect to
communication systems without secrecy constraint.
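The dimension-counting claims about the four-slot scheme can be checked numerically. The sketch below is only an illustration under simplifying assumptions that are ours: it uses random integer-valued (real) channel vectors as a stand-in for generic complex fading, ignores thermal noise and power scaling, and measures "pre-log" quantities as differences of matrix ranks computed exactly. All function names are hypothetical.

```python
import random
from fractions import Fraction


def rank(rows):
    """Exact rank of a matrix (list of rows) via Gaussian elimination."""
    mat = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    for c in range(len(mat[0]) if mat else 0):
        piv = next((i for i in range(rk, len(mat)) if mat[i][c] != 0), None)
        if piv is None:
            continue
        mat[rk], mat[piv] = mat[piv], mat[rk]
        for i in range(rk + 1, len(mat)):
            f = mat[i][c] / mat[rk][c]
            if f:
                mat[i] = [a - f * b for a, b in zip(mat[i], mat[rk])]
        rk += 1
    return rk


def add(p, q):
    return [a + b for a, b in zip(p, q)]


def receive(h, x):
    """Scalar observation h^T x, as a linear form in (u1, u2, vA1, vA2, vB1, vB2)."""
    return add([h[0] * a for a in x[0]], [h[1] * a for a in x[1]])


def cols(mat, idx):
    return [[row[j] for j in idx] for row in mat]


def four_slot_check(seed=0):
    rng = random.Random(seed)
    h = [[rng.randint(1, 10**6) for _ in range(2)] for _ in range(4)]  # h[0] is h_1
    g = [[rng.randint(1, 10**6) for _ in range(2)] for _ in range(4)]  # g[0] is g_1
    e = [[int(i == j) for j in range(6)] for i in range(6)]  # unit linear forms
    zero = [0] * 6
    x1 = [e[0], e[1]]                                        # x_1 = u
    x2 = [add(e[2], receive(h[0], x1)), e[3]]                # v_A + [h_1^T u, 0]^T
    x3 = [add(e[4], receive(g[0], x1)), e[5]]                # v_B + [g_1^T u, 0]^T
    x4 = [add(receive(g[1], x2), receive(h[2], x3)), zero]   # [z_2 + y_3, 0]^T
    xs = [x1, x2, x3, x4]
    y = [receive(h[t], xs[t]) for t in range(4)]             # receiver A, 4 x 6
    z = [receive(g[t], xs[t]) for t in range(4)]             # receiver B, 4 x 6
    u_idx, a_idx, b_idx = [0, 1], [2, 3], [4, 5]
    return {
        "A decodes v_A": rank(y) - rank(cols(y, u_idx + b_idx)),
        "A learns about v_B": rank(cols(y, u_idx + b_idx)) - rank(cols(y, u_idx)),
        "B decodes v_B": rank(z) - rank(cols(z, u_idx + a_idx)),
        "B learns about v_A": rank(cols(z, u_idx + a_idx)) - rank(cols(z, u_idx)),
    }


if __name__ == "__main__":
    print(four_slot_check())
```

In this sketch each receiver obtains two interference-free dimensions for its own symbols, while the unintended symbols occupy no dimension beyond the one already spanned by the artificial noise, which is consistent with the sum SDoF point $(\frac{1}{2}, \frac{1}{2})$ over four slots.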
In order to visualize the DoF loss due to the secrecy constraints, we provide an example of the
achievable DoF/SDoF regions with $m = 5$, $n_A = 3$, and $n_B = 2$ in Fig. 2. For the case of perfect CSIT,
the SDoF region and the DoF region are rectangular. In the MIMO-BC, we send $(n_A^2 (n_A + n_B), n_B^2 (n_A + n_B)) =
(45, 20)$ private symbols to receivers A and B, respectively, over a duration of $n_A^2 + n_B^2 + n_A n_B = 19$ slots,
yielding the DoF $(\frac{45}{19}, \frac{20}{19})$, as shown in [12]. Under the perfect secrecy constraints, we need an extra
phase of artificial noise transmission of $n_A n_B = 6$ slots to convey the two streams securely. This yields
the SDoF of $(\frac{9}{5}, \frac{4}{5})$. The comparison with the DoF region of the MIMO-BC can be interpreted in either
an optimistic or a pessimistic way. On one hand, the benefit of delayed CSIT is more significant for the
SDoF region. On the other hand, we also observe that the lack of accurate CSIT substantially decreases
the SDoF, which implies that secure communications are very sensitive to the quality of CSIT.
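The slot and symbol counts in this example are simple enough to re-derive mechanically. The short sketch below redoes the arithmetic for $m = 5$, $n_A = 3$, $n_B = 2$; the variable names are ours and the code only restates the counting argument of the paragraph above.

```python
from fractions import Fraction

n_a, n_b = 3, 2

# Private symbols sent to receivers A and B in the MIMO-BC scheme.
symbols = (n_a**2 * (n_a + n_b), n_b**2 * (n_a + n_b))

# Duration without secrecy, and the extra artificial noise phase with secrecy.
bc_slots = n_a**2 + n_b**2 + n_a * n_b
noise_slots = n_a * n_b

dof = tuple(Fraction(s, bc_slots) for s in symbols)
sdof = tuple(Fraction(s, bc_slots + noise_slots) for s in symbols)

if __name__ == "__main__":
    print("symbols:", symbols, "slots:", bc_slots, "+", noise_slots)
    print("DoF :", dof)
    print("SDoF:", sdof)
```

The resulting SDoF pair coincides with the sum SDoF point of Corollary 3.1 evaluated at $m = n_A + n_B = 5$.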

IV. Wiretap Channel: Proofs of Theorems 1 and 2

A. Converse proof of Theorem 1

We are now ready to provide the converse by considering the different cases below.
1) Case $m \le n_B$: From Fano's inequality and the secrecy constraint, we have

\begin{align}
n(R - o(\log P)) &\le I(W; y^n) - I(W; z^n) \tag{19} \\
&= I(W; y^n \,|\, z^n) - I(W; z^n \,|\, y^n) \tag{20} \\
&= h(y^n \,|\, z^n) - h(y^n \,|\, z^n, W) - I(W; z^n \,|\, y^n) \tag{21} \\
&\le h(y^n \,|\, z^n) \tag{22} \\
&\le \sum_{t=1}^{n} h(y_t \,|\, z_t) \tag{23} \\
&= \sum_{t=1}^{n} h(y_t - H_t \hat{x}_{z,t} \,|\, z_t) \tag{24} \\
&= \sum_{t=1}^{n} h(e_t - H_t (\hat{x}_{z,t} - x_t) \,|\, z_t) \tag{25} \\
&= \sum_{t=1}^{n} h(e_t - H_t (\hat{x}_{z,t} - x_t)) \tag{26} \\
&= n\, o(\log P), \tag{27}
\end{align}

where (22) is from the fact that both $I(W; z^n \,|\, y^n)$ and $h(y^n \,|\, z^n, W)$ are non-negative; in (24) we use
the fact that translations preserve differential entropy and let $\hat{x}_{z,t}$ denote the MMSE estimate of $x_t$
given $z_t$; the last equality holds because the estimation error does not scale with $P$.
2) Case $n_B < m \le \max\{n_A, n_B\}$: Since this case happens only when $n_B < m \le n_A$, we can assume
$m \le n_A$. Starting from (22), we have

\begin{align}
n(R - o(\log P)) &\le h(y^n \,|\, z^n) \tag{28} \\
&\le \frac{\min\{m, n_A + n_B\} - n_B}{n_B}\, h(z^n) \tag{29} \\
&\le n (m - n_B) \log P + n\, o(\log P), \tag{30}
\end{align}

where (29) follows straightforwardly from (9a); the last inequality comes from the fact that i.i.d. Gaussian
variables maximize the differential entropy under the variance constraint.

3) Case $m > \max\{n_A, n_B\}$: In the following we let $\bar{m} = \min\{m, n_A + n_B\}$ for notational simplicity.
We remark that two upper bounds can be obtained as a direct consequence of Lemma 1. On one hand,
(29) still holds:

\begin{equation}
I(W; y^n) - I(W; z^n) \le \frac{\bar{m} - n_B}{n_B}\, h(z^n). \tag{31}
\end{equation}

On the other hand, we have

\begin{align}
I(W; y^n) - I(W; z^n) &= h(y^n) - h(y^n \,|\, W) - h(z^n) + h(z^n \,|\, W) \tag{32} \\
&\le h(y^n) + \left(1 - \frac{n_A}{\bar{m}}\right) h(z^n \,|\, W) - h(z^n) \tag{33} \\
&\le h(y^n) - \frac{n_A}{\bar{m}}\, h(z^n), \tag{34}
\end{align}

where (33) follows from (9d) and (34) follows from $h(z^n \,|\, W) \le h(z^n)$. By combining the above two upper
bounds, we readily have

\begin{align}
n(R - o(\log P)) &\le I(W; y^n) - I(W; z^n) \tag{35} \\
&\le \min\left\{ \frac{\bar{m} - n_B}{n_B}\, h(z^n),\ h(y^n) - \frac{n_A}{\bar{m}}\, h(z^n) \right\} \tag{36} \\
&\le \max_{\alpha} \max_{\beta} \min\left\{ \frac{\bar{m} - n_B}{n_B}\, \beta,\ \alpha - \frac{n_A}{\bar{m}}\, \beta \right\} \tag{37} \\
&\le \max_{\alpha}\ \alpha \left( 1 + \frac{n_A n_B}{\bar{m} (\bar{m} - n_B)} \right)^{-1} \tag{38} \\
&\le \left( 1 + \frac{n_A n_B}{\bar{m} (\bar{m} - n_B)} \right)^{-1} n_A\, n \log P, \tag{39}
\end{align}

where (37) is because the RHS of (36) can only increase by maximizing it over both entropies $\alpha =
h(y^n)$, $\beta = h(z^n)$; in (38) the inner maximization is solved by equalizing the two terms inside the min, and
finally we use $h(y^n) \le n_A\, n \log P + o(\log P)$. This establishes the converse proof.
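The inner maximization in (37) and (38) can be sanity-checked numerically. The sketch below normalizes all entropies by $n \log P$, takes $\alpha = n_A$, and compares the value obtained by equalizing the two terms with a brute-force search over a grid of $\beta$. The function name `converse_bound` and the grid search are illustrative assumptions, not part of the proof.

```python
from fractions import Fraction


def converse_bound(n_a: int, n_b: int, m: int, grid: int = 200):
    """Return (value at the equalizing beta, closed form of (39), best grid value)."""
    m_bar = min(m, n_a + n_b)
    alpha = Fraction(n_a)  # h(y^n) normalized by n log P is at most n_a

    def inner(beta):
        # The two upper bounds (31) and (34), as functions of beta = h(z^n) / (n log P).
        return min(Fraction(m_bar - n_b, n_b) * beta,
                   alpha - Fraction(n_a, m_bar) * beta)

    # Equalize the increasing and the decreasing term inside the min.
    beta_star = alpha / (Fraction(m_bar - n_b, n_b) + Fraction(n_a, m_bar))
    closed_form = alpha / (1 + Fraction(n_a * n_b, m_bar * (m_bar - n_b)))
    # Brute force over beta in [0, n_b] on a uniform grid.
    best_on_grid = max(inner(Fraction(n_b * k, grid)) for k in range(grid + 1))
    return inner(beta_star), closed_form, best_on_grid


if __name__ == "__main__":
    print(converse_bound(3, 2, 4))
    print(converse_bound(3, 2, 5))
```

The value at the equalizing point matches the closed form, which is the third case of (13), and no grid point exceeds it.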
B. Achievability proof of Theorem 1

In the following, we wish to show the achievability of the SDoF. As in the converse part, we consider
separately the cases for different $m$. Note that only two ranges of $m$ need to be considered. The first
one is $n_B < m \le \max\{n_A, n_B\}$ and the other one is $\max\{n_A, n_B\} < m \le n_A + n_B$. For $m \le n_B$, the
SDoF is zero. For $m > n_A + n_B$, the converse shows that it is useless in terms of SDoF to use more than
$n_A + n_B$ antennas.

1) Case $n_B < m \le \max\{n_A, n_B\}$: For this case, we need to show that $d_s(n_A, n_B, m) = m - n_B$ is
achievable for $n_B < m \le n_A$. This can be simply done by sending a vector of $m$ symbols of which
$m - n_B$ symbols $v$ are the useful message and the other $n_B$ symbols $u$ are artificial noise (or a random
message). The legitimate receiver can decode all $m$ symbols and therefore extract the useful message,
i.e.,

\begin{align}
I(v; y) &= I(v, u; y) - I(u; y \,|\, v) \tag{40} \\
&= \log\det\left( I_{n_A} + \frac{P}{m} H H^H \right) - \log\det\left( I_{n_A} + \frac{P}{m} H \begin{bmatrix} I_{n_B} & \\ & 0_{m - n_B} \end{bmatrix} H^H \right) \tag{41} \\
&\doteq (m - n_B) \log P, \tag{42}
\end{align}

while the eavesdropper channel is inflated by the random message and does not expose more than a
vanishing fraction of the useful message, i.e.,

\begin{align}
I(v; z) &= I(v, u; z) - I(u; z \,|\, v) \tag{43} \\
&= \log\det\left( I_{n_B} + \frac{P}{m} G G^H \right) - \log\det\left( I_{n_B} + \frac{P}{m} G \begin{bmatrix} I_{n_B} & \\ & 0_{m - n_B} \end{bmatrix} G^H \right) \tag{44} \\
&\doteq n_B \log P - n_B \log P \tag{45} \\
&= 0, \tag{46}
\end{align}

where we used the fact that $\mathrm{rank}(H) = m$ and $\mathrm{rank}(G) = n_B$. Note that (41) and (44) are obtained by
applying independent Gaussian signaling to $v$ and $u$ with proper covariance corresponding to the power
constraint. This assumption will be implicitly applied in the rest of the paper.

2) Case $\max\{n_A, n_B\} < m \le n_A + n_B$: The proposed scheme combines the artificial noise with the
Maddah-Ali and Tse (MAT) alignment scheme [1]. The main idea of this scheme is to send the artificial
noise such that it fills the eavesdropper's observation and hides the confidential message, while it is
aligned in a reduced subspace at the legitimate receiver. The proposed three-phase scheme is presented
in Table II, where the signal model without thermal noise is described concisely with the block matrix
notation:

\begin{align}
u &= [u_1^T \cdots u_{\tau_1}^T]^T \in \mathbb{C}^{m \tau_1 \times 1}, & v &= [v_1^T \cdots v_{\tau_2}^T]^T \in \mathbb{C}^{m \tau_2 \times 1}, \tag{47} \\
H_i &= \mathrm{diag}(\{H_t\}_{t \in \mathcal{T}_i}) \in \mathbb{C}^{n_A \tau_i \times m \tau_i}, & G_i &= \mathrm{diag}(\{G_t\}_{t \in \mathcal{T}_i}) \in \mathbb{C}^{n_B \tau_i \times m \tau_i}, \tag{48}
\end{align}

where $\tau_i$ denotes the length of phase $i$ for $i = 1, 2$, and $3$, given in Table III.

TABLE II

Proposed three-phase scheme for $\max\{n_A, n_B\} < m \le n_A + n_B$.

Phase 1 ($t \in \mathcal{T}_1$)    Phase 2 ($t \in \mathcal{T}_2$)              Phase 3 ($t \in \mathcal{T}_3$)
-------------------------  -----------------------------------  --------------------------------------------------
$x_1 = u$                  $x_2 = v + \Theta y_1$               $x_3 = \Phi z_2$
$y_1 = H_1 u$              $y_2 = H_2 v + H_2 \Theta H_1 u$     $y_3 = H_3 \Phi G_2 v + H_3 \Phi G_2 \Theta H_1 u$
$z_1 = G_1 u$              $z_2 = G_2 v + G_2 \Theta H_1 u$     $z_3 = G_3 \Phi G_2 v + G_3 \Phi G_2 \Theta H_1 u$

TABLE III

Length of the three phases $\{\tau_i\}$ for different $m$, $n_A$, $n_B$.

                                      $\max\{n_A, n_B\} < m \le n_A + n_B$    $m > n_A + n_B$
------------------------------------  --------------------------------------  -------------------
$\tau_1$                              $n_A n_B$                               $n_A n_B$
$\tau_2$                              $n_A (m - n_B)$                         $n_A^2$
$\tau_3$                              $(m - n_A)(m - n_B)$                    $n_A n_B$
total duration $\sum_{i=1}^3 \tau_i$  $n_A n_B + m (m - n_B)$                 $2 n_A n_B + n_A^2$
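The phase lengths of Table III can be tabulated programmatically, together with the resulting rate of $m \tau_2$ useful symbols per total number of slots. This is a minimal sketch under the table as reconstructed above; the function names `phase_lengths` and `sdof_from_phases` are hypothetical.

```python
from fractions import Fraction


def phase_lengths(n_a: int, n_b: int, m: int):
    """Phase lengths (tau_1, tau_2, tau_3) of Table III, for m > max(n_a, n_b)."""
    if m <= max(n_a, n_b):
        raise ValueError("the three-phase scheme assumes m > max(n_a, n_b)")
    # For m > n_a + n_b only n_a + n_b antennas are used.
    m_bar = min(m, n_a + n_b)
    tau1 = n_a * n_b
    tau2 = n_a * (m_bar - n_b)
    tau3 = (m_bar - n_a) * (m_bar - n_b)
    return tau1, tau2, tau3


def sdof_from_phases(n_a: int, n_b: int, m: int) -> Fraction:
    """Useful symbols (m_bar * tau_2) divided by the total number of slots."""
    m_bar = min(m, n_a + n_b)
    tau1, tau2, tau3 = phase_lengths(n_a, n_b, m)
    return Fraction(m_bar * tau2, tau1 + tau2 + tau3)


if __name__ == "__main__":
    for m in range(4, 8):
        print(m, phase_lengths(3, 2, m), sdof_from_phases(3, 2, m))
```

For $n_A = 3$, $n_B = 2$, the ratio reproduces the delayed CSIT value of Theorem 1 for each $m > 3$.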
The three phases are explained as follows:

• Phase 1, $t \in \mathcal{T}_1 = \{1, \ldots, \tau_1\}$: sending the artificial noise. The $m \tau_1$ symbols sent in $\tau_1$ time slots
are represented by $u$.

• Phase 2, $t \in \mathcal{T}_2 = \{\tau_1 + 1, \ldots, \tau_1 + \tau_2\}$: sending the confidential symbols with the artificial noise
seen by the legitimate receiver. In $\tau_2$ time slots, we send the $m \tau_2$ useful symbols represented by
$v$, superimposed on a linear combination (specified by $\Theta$) of the artificial noise observed by the
legitimate receiver in phase 1.⁴

• Phase 3, $t \in \mathcal{T}_3 = \{\tau_1 + \tau_2 + 1, \ldots, \tau_1 + \tau_2 + \tau_3\}$: repeating the eavesdropper's observation
during phase 2. The final phase consists in sending a linear combination (specified by $\Phi$) of the
eavesdropper's observation in phase 2. The aim of this phase is to complete the equations for the
legitimate receiver to solve for the useful symbols $v$ without exposing anything new to the eavesdropper.

4 As mentioned before, we ignore the scaling factor necessary to meet the power constraint. The same holds for the transmit
vector in phase 3.

After three phases, the observations are given by



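\begin{align}
y &= \underbrace{\begin{bmatrix} I_{n_A \tau_1} & 0 \\ H_2 \Theta & H_2 \\ H_3 \Phi G_2 \Theta & H_3 \Phi G_2 \end{bmatrix}}_{H} \begin{bmatrix} H_1 u \\ v \end{bmatrix} + e, \tag{50a} \\
z &= \underbrace{\begin{bmatrix} G_1 & 0 \\ G_2 \Theta H_1 & I_{n_B \tau_2} \\ G_3 \Phi G_2 \Theta H_1 & G_3 \Phi \end{bmatrix}}_{G} \begin{bmatrix} u \\ G_2 v \end{bmatrix} + b. \tag{50b}
\end{align}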

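Therefore, we have

\begin{align}
I(v; y) &= I(v, H_1 u; y) - I(H_1 u; y \,|\, v) \tag{51} \\
&\doteq \mathrm{rank}(H) \log P - \mathrm{rank}\begin{bmatrix} I_{n_A \tau_1} \\ H_2 \Theta \\ H_3 \Phi G_2 \Theta \end{bmatrix} \log P \tag{52} \\
&= \left( n_A \tau_1 + \mathrm{rank}\begin{bmatrix} H_2 \\ H_3 \Phi G_2 \end{bmatrix} \right) \log P - n_A \tau_1 \log P \tag{53} \\
&= \mathrm{rank}\begin{bmatrix} H_2 \\ H_3 \Phi G_2 \end{bmatrix} \log P \tag{54} \\
&= m\, n_A (m - n_B) \log P, \tag{55}
\end{align}

where (53) follows due to the block-triangular structure of $H$ and by the fact that the rank of the second
term corresponds to the rank of the identity matrix. In order to prove the last equality, we need to show
first that the submatrix $H_3 \Phi G_2$ has full row rank with $n_A \tau_3$ linearly independent rows. This is satisfied
by letting

\begin{equation}
\Phi \Pi = \begin{bmatrix} \mathrm{diag}(\{\Phi_t\}_{t=1}^{\tau_3}) & 0_{m \tau_3 \times (n_B \tau_2 - n_A \tau_3)} \end{bmatrix}, \tag{56}
\end{equation}

where $\Pi$ is a permutation matrix such that the first $n_A \tau_3$ rows of $\Pi^T G_2$ are block diagonal, denoted by
$\mathrm{diag}(\{G_t^{\Pi}\}_{t \in \mathcal{T}_2})$, where $G_t^{\Pi} \in \mathbb{C}^{(m - n_A) \times m}$ is a submatrix of $G_{2,t}$; $\Phi_t$ denotes an $m \times n_A$ matrix with
$n_A$ independent columns, e.g., $\Phi_t = [I_{n_A}\ 0_{n_A \times (m - n_A)}]^T$. Note that with this particular choice of $\Phi$, the
resulting submatrix is given by

\begin{equation}
H_3 \Phi G_2 = \mathrm{diag}(\{H_{t + \tau_1 + \tau_2} \Phi_t\}_{t=1}^{\tau_3})\ \mathrm{diag}(\{G_t^{\Pi}\}_{t \in \mathcal{T}_2}). \tag{57}
\end{equation}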
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Since the first matrix in the right hand side of (l5Tb is square and full-rank, it is easy to see that 

rank ( ) = ran k (diag({Gf } 7 ) )' ^ ^ e row P ermutat i° n ' we can readily show that the latter 

has a desired rank of mn^m — n&). Namely, 

( H \ Mm-n B ) \ 

rank " = > rank ' (58) 

ydiag ({G^JteT 2 ) J ti \G% t ) 

= n/\{m — ne)m, (59) 

where the last equality follows by noticing that each block t corresponds to m different rows of the state 
matrix St which are linearly independent from Assumption 12.21 On the other hand, the eavesdropper's 
observation is filled by the artificial noise and thus does not expose more than a vanishing fraction of 
the useful message, i.e., 

I(v;z) <I(G 2 v;z) (60) 
= I(G 2 v,u;z)-I(u;z\G 2 v) (61) 

= n B (n + t 2 ) log P - rank ( G 2 ©# i j log P (62) 

= mn A n B log P - rank ^ ^ ^ J log P (63) 

= 0, (64) 

where (l60l follows due to the Markov chain v o G 2 v -B> z; (l62l follows by noticing that the rank of G 
is determined by the submatrix corresponding to first two phases; (l63l follows because the third block 
row is a linear combination of rows from the second block row. In order to prove the last equality, we 
choose 



&n= [diagtfSt}^) o mriX(nAT1 _ nBT2) 



(65) 



where $\Pi$ is a permutation matrix$^5$ such that the first $n_B\tau_2$ rows of $\Pi^T H_1$ are block diagonal, denoted by $\operatorname{diag}(\{\tilde H_t\}_{t\in\mathcal T_1})$, with $\tilde H_t$ being an $(m-n_B)\times m$ submatrix of $H_{1,t}$; $\Theta_t$ denotes an $m\times n_B$ matrix with $n_B$ independent columns, e.g., $\Theta_t = [I_{n_B}\ \ 0_{n_B\times(m-n_B)}]^T$. By applying exactly the same reasoning as on the choice of $\Phi$, we can prove that
\[
\operatorname{rank}\begin{pmatrix} G_1 \\ G_2\Theta H_1 \end{pmatrix} = m\tau_1 = m n_A n_B.
\]
As a result, the $n_A m(m-n_B)$ useful symbols can be conveyed secretly over $n_A n_B + m(m-n_B)$ time slots in the high SNR regime, yielding the SDoF of
\[
\frac{n_A m(m-n_B)}{n_A n_B + m(m-n_B)}.
\]
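The symbol and slot counts above translate directly into a small calculation. The following is a minimal sketch, assuming phase lengths $\tau_1 = n_A n_B$, $\tau_2 = n_A(m-n_B)$, and $\tau_3 = (m-n_A)(m-n_B)$, which are inferred here from the totals quoted in the text; the function name `wiretap_ana` is hypothetical.

```python
from fractions import Fraction


def wiretap_ana(m, n_a, n_b):
    """Phase lengths and SDoF of the three-phase ANA scheme for the wiretap
    channel with delayed CSIT on both channels (assumed phase lengths)."""
    if not max(n_a, n_b) < m <= n_a + n_b:
        raise ValueError("sketch covers max(n_a, n_b) < m <= n_a + n_b only")
    tau1 = n_a * n_b               # artificial noise only
    tau2 = n_a * (m - n_b)         # useful symbols plus aligned noise
    tau3 = (m - n_a) * (m - n_b)   # repetition of the eavesdropper's view
    symbols = m * tau2             # m useful symbols per phase-2 slot
    slots = tau1 + tau2 + tau3
    # The slot count must match n_A n_B + m(m - n_B) from the text.
    assert slots == n_a * n_b + m * (m - n_b)
    return (tau1, tau2, tau3), Fraction(symbols, slots)
```

With $m = 3$ and $n_A = n_B = 2$, the three phases last 4, 2, and 1 slots and carry 6 useful symbols, so the ratio is 6/7.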

5 We abuse the notation by letting $\Pi$ denote a permutation matrix different from the one used in (56).

C. Achievability proof of Theorem 2

In this subsection, we provide the achievability proof for the case of delayed partial CSIT, when the transmitter has delayed CSI only on the legitimate channel. We focus on the case $\max\{n_A,n_B\} < m \le n_A+n_B$. For the case of $m > n_A+n_B$, we can easily show that the desired SDoF follows by using only $n_A+n_B$ antennas out of $m$, i.e., by replacing $m$ by $n_A+n_B$, similarly to the case of delayed CSIT on both channels. We propose a variant of the artificial noise alignment scheme described previously. The lack of CSIT on the eavesdropper channel requires the following modifications. First, the transmission consists of only the first two phases presented in Table III, because the lack of CSI on the eavesdropper channel does not enable the transmitter to repeat the signal overheard by the eavesdropper (corresponding to the third phase). Consequently, the confidential symbols $v$ sent during the second phase must be decoded within this phase. This decreases the dimension of $v$ from $m\tau_2$ to $n_A\tau_2$. After two phases, the observations are given by
\begin{align}
y &= \begin{bmatrix} I_{n_A\tau_1} & 0 \\ H_2\Theta & H_2 \end{bmatrix}\begin{bmatrix} H_1 u \\ v \end{bmatrix} + e, \tag{66a}\\
z &= \begin{bmatrix} G_1 & 0 \\ G_2\Theta H_1 & I_{n_B\tau_2} \end{bmatrix}\begin{bmatrix} u \\ G_2 v \end{bmatrix} + b. \tag{66b}
\end{align}



Following similar steps as before and choosing $\Theta$ as in (65), we can easily show that
\begin{align}
I(v;y) &= n_A^2(m-n_B)\log P, \tag{67}\\
I(v;z) &= 0. \tag{68}
\end{align}
As a result, the $n_A^2(m-n_B)$ useful symbols can be conveyed secretly over $n_A m$ time slots in the high SNR regime, yielding the SDoF of
\[
\frac{n_A(m-n_B)}{m}.
\]
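To make the cost of losing the third phase concrete, the two-phase count can be evaluated numerically. This is a hedged sketch: the function name `partial_csit_sdof` is hypothetical, and the phase lengths $\tau_1 = n_A n_B$ and $\tau_2 = n_A(m-n_B)$ are assumed to be the same as in the three-phase scheme.

```python
from fractions import Fraction


def partial_csit_sdof(m, n_a, n_b):
    """SDoF of the two-phase variant (delayed CSI on the legitimate channel
    only), for max(n_a, n_b) < m <= n_a + n_b."""
    if not max(n_a, n_b) < m <= n_a + n_b:
        raise ValueError("sketch covers max(n_a, n_b) < m <= n_a + n_b only")
    tau1 = n_a * n_b          # artificial noise only
    tau2 = n_a * (m - n_b)    # useful symbols plus aligned noise
    symbols = n_a * tau2      # only n_a symbols per slot are decodable in phase 2
    slots = tau1 + tau2       # equals n_a * m
    return Fraction(symbols, slots)
```

For $m = 3$ and $n_A = n_B = 2$ this returns 2/3, which matches $n_A(m-n_B)/m$ and is below the 6/7 obtained when delayed CSI is available on both channels.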



V. Broadcast Channel with Confidential Messages: Proof of Theorem 3
A. Converse

We focus on the case $m > \max\{n_A, n_B\}$ in the following. The converse for the other cases is trivial from Section IV. The secrecy constraint (5) together with Fano's inequality for $W_B$, i.e., $h(W_B \mid z^n) \le n\epsilon$, yields
\[
I(W_A; z^n \mid W_B) \le n\,o(\log P). \tag{69}
\]
Similarly to the converse of the MIMO wiretap channel, we obtain two upper bounds on $R_A$. The first bound is obtained by combining (69) with Fano's inequality on $W_A$, i.e., $h(W_A \mid y^n) \le n\epsilon$,

\begin{align}
n(R_A - o(\log P)) &\le I(W_A; y^n \mid W_B) - I(W_A; z^n \mid W_B) \nonumber\\
&\le I(W_A; y^n \mid z^n, W_B) \tag{70}\\
&\le h(y^n \mid z^n, W_B) \tag{71}\\
&\le \frac{m-n_B}{n_B}\, h(z^n \mid W_B), \tag{72}
\end{align}
where (70) follows by $I(W_A; y^n \mid W_B) \le I(W_A; y^n, z^n \mid W_B)$; (72) follows from inequality (9a) in Lemma 1. The second bound is (34), which holds also here by replacing $W$ by $W_A$, namely,
\[
I(W_A; y^n) - I(W_A; z^n) \le h(y^n) - \frac{n_A}{m}\, h(z^n). \tag{73}
\]
Putting the two upper bounds together, we have
\[
n(R_A - o(\log P)) \le \min\Bigl\{ \frac{m-n_B}{n_B}\, h(z^n \mid W_B),\; h(y^n) - \frac{n_A}{m}\, h(z^n) \Bigr\}. \tag{74}
\]
On the other hand, Fano's inequality for $W_B$ leads to
\[
n(R_B - o(\log P)) \le h(z^n) - h(z^n \mid W_B). \tag{75}
\]
Now, we sum inequalities (74) and (75) with the weights $T_A \triangleq n_A n_B + m(m-n_B)$ and $n_A(m-n_B)$, respectively. This yields
\begin{align}
n\bigl(T_A R_A + n_A(m-n_B) R_B - o(\log P)\bigr) &\le \max_{h(y^n)} \max_{a} \min\Bigl\{ (m-n_B)\,a,\; T_A h(y^n) - \frac{n_A n_B}{m}\, a \Bigr\} \tag{76}\\
&\le \max_{h(y^n)} m(m-n_B)\, h(y^n) \tag{77}\\
&\le n_A m(m-n_B)\, n \log P, \tag{78}
\end{align}
where we let $a \triangleq n_A h(z^n) + \frac{m(m-n_B)}{n_B} h(z^n \mid W_B)$ in the first inequality, and the last inequality follows because $h(y^n) \le n\, n_A \log P + o(\log P)$. By dividing both sides by $n_A m(m-n_B)\log P$ and letting $P$ grow, we obtain the first desired inequality (15a). Due to the symmetry of the problem, (15b) can be obtained by swapping the roles of $R_A$ and $R_B$. This completes the converse proof.
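The step from (76) to (77) can be made explicit. The following worked computation is a sketch that uses only the definitions of $T_A$ and $a$ given above; the symbols $d_A$ and $d_B$ denote the SDoF of the two messages and are introduced here for illustration. The first argument of the minimum increases with $a$ while the second decreases, so the inner maximum is attained where the two coincide:

\begin{align*}
(m-n_B)\,a = T_A h(y^n) - \frac{n_A n_B}{m}\,a
&\iff \frac{m(m-n_B) + n_A n_B}{m}\,a = T_A h(y^n) \\
&\iff a = m\,h(y^n),
\end{align*}

since $T_A = n_A n_B + m(m-n_B)$. Substituting back gives the value $m(m-n_B)\,h(y^n)$, which is the right-hand side of (77). After normalizing (78) by $n\log P$ and letting $P$ grow, the weighted bound reads

\[
T_A\, d_A + n_A(m-n_B)\, d_B \le n_A m(m-n_B).
\]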

B. Achievability 

The corner points can be achieved by the ANA scheme described in Section IV. Here, we provide a strategy achieving the sum SDoF point. In fact, the ANA scheme for the MIMO wiretap channel in Section IV can be suitably modified to convey two confidential messages. We focus on the case $\max\{n_A,n_B\} < m \le n_A+n_B$ because the converse proof says that we only need to use $n_A+n_B$ antennas for the case $m > n_A+n_B$.

TABLE IV

Proposed four-phase scheme for $\max\{n_A, n_B\} < m \le n_A + n_B$.

\[
\begin{array}{l|l|l}
\text{Phase 1} & \text{Phase 2} & \text{Phase 3} \\ \hline
x_1 = u & x_2 = v_A + \Theta_A y_1 & x_3 = v_B + \Theta_B z_1 \\
y_1 = H_1 u & y_2 = H_2(v_A + \Theta_A H_1 u) & y_3 = H_3(v_B + \Theta_B G_1 u) \\
z_1 = G_1 u & z_2 = G_2(v_A + \Theta_A H_1 u) & z_3 = G_3(v_B + \Theta_B G_1 u)
\end{array}
\]
\[
\begin{array}{l}
\text{Phase 4} \\ \hline
x_4 = \Phi_A z_2 + \Phi_B y_3 \\
y_4 = H_4\Phi_A(G_2 v_A + G_2\Theta_A H_1 u) + H_4\Phi_B(H_3 v_B + H_3\Theta_B G_1 u) \\
z_4 = G_4\Phi_A(G_2 v_A + G_2\Theta_A H_1 u) + G_4\Phi_B(H_3 v_B + H_3\Theta_B G_1 u)
\end{array}
\]

TABLE V

Length of four phases $\{\tau_i\}$ for different $m$, $n_A$, and $n_B$.

\[
\begin{array}{l|c|c}
 & \max\{n_A, n_B\} < m \le n_A + n_B & m > n_A + n_B \\ \hline
\tau_1 & n_A n_B & n_A n_B \\
\tau_2 & n_A(m-n_B) & n_A^2 \\
\tau_3 & n_B(m-n_A) & n_B^2 \\
\tau_4 & (m-n_A)(m-n_B) & n_A n_B \\ \hline
\text{total duration } \sum_{i=1}^{4}\tau_i & m^2 & (n_A+n_B)^2
\end{array}
\]

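The proposed four-phase scheme is presented in Table IV, where the signal model is described concisely with the block matrix notation:
\begin{align}
u &= [u_1^T\ \cdots\ u_{\tau_1}^T]^T \in \mathbb C^{m\tau_1\times 1}, \tag{79}\\
v_A &\in \mathbb C^{m\tau_2\times 1}, & v_B &\in \mathbb C^{m\tau_3\times 1}, \tag{80}\\
H_i &= \operatorname{diag}(\{H_t\}_{t\in\mathcal T_i}) \in \mathbb C^{n_A\tau_i\times m\tau_i}, & G_i &= \operatorname{diag}(\{G_t\}_{t\in\mathcal T_i}) \in \mathbb C^{n_B\tau_i\times m\tau_i}, \tag{81}\\
\Theta_A &\in \mathbb C^{m\tau_2\times n_A\tau_1}, & \Theta_B &\in \mathbb C^{m\tau_3\times n_B\tau_1}, \tag{82}\\
\Phi_A &\in \mathbb C^{m\tau_4\times n_B\tau_2}, & \Phi_B &\in \mathbb C^{m\tau_4\times n_A\tau_3}, \tag{83}
\end{align}
where the durations of the four phases $\{\tau_i\}_i$ are given in Table V. The four phases consist of: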

• Phase 1, $t \in \mathcal T_1 = \{1, \ldots, \tau_1\}$: sending the artificial noise. The $m\tau_1$ symbols sent in $\tau_1$ time slots are represented by $u$.

• Phase 2, $t \in \mathcal T_2 = \{\tau_1 + 1, \ldots, \tau_1 + \tau_2\}$: sending the confidential symbols $v_A$ with the artificial noise seen by receiver A. In $\tau_2$ time slots, we send the $m\tau_2$ useful symbols represented by $v_A$, superimposed by a linear combination (specified by $\Theta_A$) of the artificial noise observed by receiver A in phase 1.

• Phase 3, $t \in \mathcal T_3 = \{\tau_1 + \tau_2 + 1, \ldots, \tau_1 + \tau_2 + \tau_3\}$: sending the confidential symbols $v_B$ with the artificial noise seen by receiver B. In $\tau_3$ time slots, we send the $m\tau_3$ useful symbols represented by $v_B$, superimposed by a linear combination (specified by $\Theta_B$) of the artificial noise observed by receiver B in phase 1.

• Phase 4, $t \in \mathcal T_4 = \{\tau_1 + \tau_2 + \tau_3 + 1, \ldots, \tau_1 + \tau_2 + \tau_3 + \tau_4\}$: repeating the past observations from phases 2 and 3. The final phase consists in sending a linear combination of receiver B's observation in phase 2 (specified by $\Phi_A$) and receiver A's observation in phase 3 (specified by $\Phi_B$). The aim of this phase is to complete the equations for the intended receivers to solve the useful symbols without exposing anything new about the message to the unintended receivers.
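The phase lengths of Table V and the resulting per-user SDoF can be tabulated with a few lines of code. This is a minimal sketch: the function name `bcc_four_phase` and the variable `m_eff` (the number of antennas actually used, $\min\{m, n_A+n_B\}$) are hypothetical names introduced for illustration.

```python
from fractions import Fraction


def bcc_four_phase(m, n_a, n_b):
    """Phase lengths (tau_1..tau_4) of the four-phase scheme and the SDoF pair
    (d_A, d_B) it achieves, following the entries of Table V."""
    m_eff = min(m, n_a + n_b)   # only n_a + n_b antennas are used if m is larger
    if m_eff <= max(n_a, n_b):
        raise ValueError("sketch covers m > max(n_a, n_b) only")
    taus = (
        n_a * n_b,                     # phase 1: artificial noise
        n_a * (m_eff - n_b),           # phase 2: v_A plus aligned noise
        n_b * (m_eff - n_a),           # phase 3: v_B plus aligned noise
        (m_eff - n_a) * (m_eff - n_b)  # phase 4: repetition of past observations
    )
    total = sum(taus)
    assert total == m_eff ** 2         # total duration listed in Table V
    d_a = Fraction(m_eff * taus[1], total)  # m useful symbols per phase-2 slot
    d_b = Fraction(m_eff * taus[2], total)  # m useful symbols per phase-3 slot
    return taus, d_a, d_b
```

For $m = 3$ and $n_A = n_B = 2$ the phases last 4, 2, 2, and 1 slots (9 in total) and each user obtains 2/3, in line with $n_A(m-n_B)/m$ and $n_B(m-n_A)/m$.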

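After four phases, the observations are given by
\begin{align}
y &= \underbrace{\begin{bmatrix}
0 & I_{n_A\tau_1} & 0 \\
H_2 & H_2\Theta_A & 0 \\
0 & 0 & I_{n_A\tau_3} \\
H_4\Phi_A G_2 & H_4\Phi_A G_2\Theta_A & H_4\Phi_B
\end{bmatrix}}_{H_{\mathrm{bcc}}}
\begin{bmatrix} v_A \\ H_1 u \\ H_3 v_B + H_3\Theta_B G_1 u \end{bmatrix} + e, \tag{84a}\\
z &= \underbrace{\begin{bmatrix}
0 & I_{n_B\tau_1} & 0 \\
0 & 0 & I_{n_B\tau_2} \\
G_3 & G_3\Theta_B & 0 \\
G_4\Phi_B H_3 & G_4\Phi_B H_3\Theta_B & G_4\Phi_A
\end{bmatrix}}_{G_{\mathrm{bcc}}}
\begin{bmatrix} v_B \\ G_1 u \\ G_2 v_A + G_2\Theta_A H_1 u \end{bmatrix} + b. \tag{84b}
\end{align}
First, we examine the mutual information between $v_A$ and $y$: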

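\begin{align}
I(v_A; y) &= I(v_A, H_1 u, H_3 v_B + H_3\Theta_B G_1 u;\, y) - I(H_1 u, H_3 v_B + H_3\Theta_B G_1 u;\, y \mid v_A) \tag{85}\\
&= \operatorname{rank}(H_{\mathrm{bcc}})\log P - \operatorname{rank}\begin{pmatrix}
I_{n_A\tau_1} & 0 \\
H_2\Theta_A & 0 \\
0 & I_{n_A\tau_3} \\
H_4\Phi_A G_2\Theta_A & H_4\Phi_B
\end{pmatrix}\log P \tag{86}\\
&= \Bigl( n_A(\tau_1+\tau_3) + \operatorname{rank}\begin{pmatrix} H_2 \\ H_4\Phi_A G_2 \end{pmatrix} \Bigr)\log P - n_A(\tau_1+\tau_3)\log P \tag{87}\\
&= \operatorname{rank}\begin{pmatrix} H_2 \\ H_4\Phi_A G_2 \end{pmatrix}\log P \tag{88}\\
&= m n_A(m-n_B)\log P, \tag{89}
\end{align}
where in (87) the first term is due to the block-triangular structure of $H_{\mathrm{bcc}}$ and the second term follows because the rank corresponds to the rank of the identity matrix; (89) follows by choosing $\Phi_A$ as given in (56), where we replace $\tau_3$ with $\tau_4$.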

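Next, in order to examine the leakage of $v_A$ to receiver B, we write
\begin{align}
I(v_A; z, v_B) &= I(v_A; z \mid v_B) \tag{90}\\
&\le I(G_2 v_A; z \mid v_B) \tag{91}\\
&= I(G_2 v_A, u; z \mid v_B) - I(u; z \mid G_2 v_A, v_B) \tag{92}\\
&\le I(G_1 u, G_2 v_A + G_2\Theta_A H_1 u;\, z \mid v_B) - I(u; z \mid G_2 v_A, v_B) \tag{93}\\
&= \operatorname{rank}\begin{pmatrix}
I_{n_B\tau_1} & 0 \\
0 & I_{n_B\tau_2} \\
G_3\Theta_B & 0 \\
G_4\Phi_B H_3\Theta_B & G_4\Phi_A
\end{pmatrix}\log P
- \operatorname{rank}\begin{pmatrix}
G_1 \\
G_2\Theta_A H_1 \\
G_3\Theta_B G_1 \\
G_4\Phi_B H_3\Theta_B G_1 + G_4\Phi_A G_2\Theta_A H_1
\end{pmatrix}\log P \tag{94}\\
&= n_B(\tau_1+\tau_2)\log P - \operatorname{rank}\begin{pmatrix} G_1 \\ G_2\Theta_A H_1 \end{pmatrix}\log P \tag{95}\\
&= 0, \tag{96}
\end{align}
where (91) follows from the Markov chain $v_A \leftrightarrow G_2 v_A \leftrightarrow z$; (93) is due to another Markov chain $(G_2 v_A, u) \leftrightarrow (G_1 u, G_2 v_A + G_2\Theta_A H_1 u) \leftrightarrow z$; in (95) we notice that the two block columns of $G_{\mathrm{bcc}}$ are block-triangular, and the second term follows by keeping only linearly independent block rows; the last equality is obtained by setting $\Theta_A$ as given in (65). As a result, the SDoF $d_A = \frac{n_A(m-n_B)}{m}$ is achieved with the proposed scheme. By symmetry of the problem, we have $d_B = \frac{n_B(m-n_A)}{m}$, which completes the proof.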

VI. Conclusions and Perspectives 

We studied the impact of delayed CSIT on the MIMO wiretap channel and the MIMO broadcast channel with confidential messages (BCC) by focusing on the secrecy degrees of freedom (SDoF) metric. The optimal SDoF region of the two-user Gaussian MIMO-BCC was fully characterized. It is shown that an artificial noise alignment (ANA) scheme, which can be regarded as a non-trivial extension of the Maddah-Ali Tse (MAT) scheme, can achieve the entire SDoF region. The proposed ANA scheme makes it possible to quantify the resource overhead dedicated to securing the confidential messages, which in turn appears as a DoF loss. Although delayed CSIT was found useful to improve the SDoF over a wide range of MIMO configurations, our study also revealed a bottleneck of physical-layer security, namely its high sensitivity to the quality of CSIT.

Several interesting open problems emerge out of this work. First, some techniques used for lower- and upper-bounding the SDoF in this work may provide further insight into related problems in moderate SNR regimes. Second, the characterization of the SDoF upper bound of the Gaussian MIMO wiretap channel with delayed partial CSIT remains open. We emphasize that for the case of partial CSIT, the inequalities due to the channel symmetry still hold true, but these do not seem to be enough to prove the converse. The challenge consists of finding novel and tighter inequalities that capture some new asymmetry between $h(z^n)$ and $h(y^n)$. Finally, the extension to more complex scenarios, such as the BCC with more than two receivers, can also be investigated.

Appendix A
Proof of Lemma 1

Lemma 3: Let $x^L = (x_1, \ldots, x_L)$ be entropy-symmetric such that $h(\{x_j : j \in \mathcal J\}) = h(\{x_k : k \in \mathcal K\})$, for any $|\mathcal J| = |\mathcal K| \le L$. Then, for any $M \ge N$, we have
\begin{align}
h(x^{N+k}) - h(x^N) &\ge h(x^{M+k}) - h(x^M), \quad \forall\, k \ge 0, \tag{97a}\\
\text{and}\quad M h(x^N) &\ge N h(x^M), \tag{97b}
\end{align}
where we define $h(x^0) = 0$ for convenience of notation.
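As a quick sanity check of the two inequalities, one can evaluate them for a family that is entropy-symmetric by construction. The sketch below assumes real, unit-variance, equicorrelated Gaussian variables with correlation `rho` in $[0, 1)$, whose covariance determinant is $(1-\rho)^{n-1}(1+(n-1)\rho)$; the function names are hypothetical and the check is numerical, not a proof.

```python
import math


def entropy(n, rho=0.5):
    """Differential entropy (in nats) of n equicorrelated, unit-variance real
    Gaussian variables; h(x^0) = 0 by convention."""
    if n == 0:
        return 0.0
    log_det = (n - 1) * math.log(1 - rho) + math.log(1 + (n - 1) * rho)
    return 0.5 * (n * math.log(2 * math.pi * math.e) + log_det)


def lemma3_holds(L, rho=0.5, tol=1e-12):
    """Check (97a) and (97b) for all 0 <= N <= M <= L and admissible k."""
    for N in range(L + 1):
        for M in range(N, L + 1):
            # (97b): M h(x^N) >= N h(x^M)
            if M * entropy(N, rho) < N * entropy(M, rho) - tol:
                return False
            # (97a): entropy increments shrink as the starting length grows
            for k in range(L - M + 1):
                lhs = entropy(N + k, rho) - entropy(N, rho)
                rhs = entropy(M + k, rho) - entropy(M, rho)
                if lhs < rhs - tol:
                    return False
    return True
```

Calling `lemma3_holds(6)` or `lemma3_holds(6, rho=0.9)` exercises every admissible triple $(N, M, k)$ up to length 6 for that correlation value.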

Proof: For $M = N$, the inequalities (97a) and (97b) hold with equality trivially. It is thus without loss of generality to assume that $M > N$.

We first prove inequality (97a). It is readily shown that
\begin{align}
h(x^{N+k}) - h(x^N) &= h(x_1, \ldots, x_{N+k}) - h(x_1, \ldots, x_N) \tag{98}\\
&= h(x_1, \ldots, x_{N+k}) - h(x_{k+1}, \ldots, x_{N+k}) \tag{99}\\
&= h(x_1, \ldots, x_k \mid x_{k+1}, \ldots, x_{N+k}), \tag{100}
\end{align}
where (99) is from the entropy-symmetry of $x^L$. Since the last quantity is decreasing in $N \ge 0$, (97a) is immediate.

For the inequality (97b), we prove it by induction on $L$. For $L = 2$, the only non-trivial case is $M = 2$ and $N = 1$, where we have
\begin{align}
2h(x^1) &= h(x_1) + h(x_2) \tag{101}\\
&\ge h(x_1, x_2). \tag{102}
\end{align}
Assume that the result holds for $L = l-1$, i.e., (97b) is true for any $(M, N) \in \{(j,k) : l-1 \ge j \ge k\}$. We would like to prove that it holds for any $(M, N) \in \{(j,k) : l \ge j \ge k\}$. In particular, all we need to prove is that the inequality holds for $M = l$ and any $N \le l-1$, i.e.,
\[
l\,h(x^N) - N h(x^l) \ge 0, \quad \forall\, N \le l-1. \tag{103}
\]
To this end, we first write
\[
l\,h(x^N) - N h(x^l) = (l-N)\,h(x^N) - N\bigl(h(x^l) - h(x^N)\bigr). \tag{104}
\]
For $N$ such that $l - N \le N$, we can lower-bound the right hand side (RHS) of (104) as
\begin{align}
(l-N)h(x^N) - N\bigl(h(x^l) - h(x^N)\bigr) &\ge (l-N)h(x^N) - N\bigl(h(x^N) - h(x^{2N-l})\bigr) \tag{105}\\
&= N h(x^{2N-l}) - (2N-l)\,h(x^N) \tag{106}\\
&\ge 0, \tag{107}
\end{align}
where the first inequality is from the fact that, applying (97a), $h(x^l) - h(x^N) \le h(x^N) - h(x^{2N-l})$; the last inequality is from the induction assumption, since $(N, 2N-l)$ is such that $l-1 \ge N \ge 2N-l$. For $N$ such that $l - N > N$, we lower-bound the RHS of (104) differently:
\begin{align}
(l-N)h(x^N) - N\bigl(h(x^l) - h(x^N)\bigr) &\ge (l-N)h(x^N) - N h(x^{l-N}) \tag{108}\\
&\ge 0, \tag{109}
\end{align}

where the first inequality is from the fact that, applying (97a), $h(x^l) - h(x^N) \le h(x^{l-N}) - h(x^0)$ with $h(x^0) = h(\emptyset) = 0$ by definition; the last inequality is from the induction assumption since $(l-N, N)$ is such that $l-1 \ge l-N > N$. The proof of (103) is complete.

By symmetry of the problem, we only need to prove (9a). We first consider the case $n_A + n_B \le m$.
\begin{align}
n_B h(y^n, z^n) &= n_B \sum_{t=1}^{n} h(y_t, z_t \mid y^{t-1}, z^{t-1}) \tag{110}\\
&= n_B \sum_{t=1}^{n} h(\omega_t \mid y^{t-1}, z^{t-1}) \tag{111}\\
&\le (n_A + n_B) \sum_{t=1}^{n} h(z_t \mid y^{t-1}, z^{t-1}) \tag{112}\\
&\le (n_A + n_B) \sum_{t=1}^{n} h(z_t \mid z^{t-1}) \tag{113}\\
&\le (n_A + n_B)\, h(z^n), \tag{114}
\end{align}
where we define $\omega_t \triangleq \{y_t, z_t\}$; (112) is the application of (97b).

When $m < n_A + n_B$, (112) is loose. We tighten the bound as follows.
\begin{align}
n_B h(y^n, z^n) &= n_B \sum_{t=1}^{n} h(\omega_t \mid y^{t-1}, z^{t-1}) \tag{115}\\
&= n_B \sum_{t=1}^{n} \bigl( h(\bar\omega_t \mid y^{t-1}, z^{t-1}) + h(\tilde\omega_t \mid \bar\omega_t, y^{t-1}, z^{t-1}) \bigr) \tag{116}\\
&\le n_B \sum_{t=1}^{n} h(\bar\omega_t \mid y^{t-1}, z^{t-1}) + o(\log P) \tag{117}\\
&\le m \sum_{t=1}^{n} h(z_t \mid y^{t-1}, z^{t-1}) + o(\log P) \tag{118}\\
&\le m \sum_{t=1}^{n} h(z_t \mid z^{t-1}) + o(\log P) \tag{119}\\
&\le m\, h(z^n) + o(\log P), \tag{120}
\end{align}
where we partition $\omega_t$ as $\omega_t = \{\bar\omega_t, \tilde\omega_t\}$ in such a way that $\bar\omega_t$ and $\tilde\omega_t$ are of length $m$ and $n_A + n_B - m$, respectively; (117) is from the fact that $h(\tilde\omega_t \mid \bar\omega_t, y^{t-1}, z^{t-1}) \le h(\tilde\omega_t \mid \bar\omega_t)$ and that $h(\tilde\omega_t \mid \bar\omega_t) \le o(\log P)$ with the same reasoning applied in (23)-(27).